Update NA repr #30821

Merged
merged 9 commits into from
Jan 9, 2020


TomAugspurger
Contributor

Closes #30415

In [2]: df = pd.DataFrame({"A": pd.array([1, 2, None])})

In [3]: df
Out[3]:
      A
0     1
1     2
2  <NA>

In [4]: df.A
Out[4]:
0       1
1       2
2    <NA>
Name: A, dtype: Int64

In [5]: df.A.array
Out[5]:
<IntegerArray>
[1, 2, <NA>]
Length: 3, dtype: Int64

I think adding color with ANSI codes / custom HTML formatting is still worth doing as in #30778, but this is an improvement for now.

@TomAugspurger TomAugspurger added this to the 1.0 milestone Jan 8, 2020
@@ -354,10 +354,7 @@ class NAType(C_NAType):
         return NAType._instance

     def __repr__(self) -> str:
-        return "NA"
-
-    def __str__(self) -> str:
Contributor Author

Python will automatically fall back to __repr__ if __str__ isn't defined, so this was redundant.
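
A minimal sketch of the fallback described above, using a hypothetical `Sentinel` class rather than pandas' actual `NAType`: when a class defines only `__repr__`, `str()` and f-string formatting reuse it.

```python
class Sentinel:
    """Hypothetical stand-in for a singleton like NA; not pandas code."""

    def __repr__(self) -> str:
        return "<NA>"


value = Sentinel()

# object.__str__ delegates to __repr__ when the class does not override it,
# so all three calls produce the same text.
print(repr(value))
print(str(value))
print(f"{value}")
```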

@TomAugspurger
Contributor Author

I had to include <NA> in the list of default NA values for read_csv to get this working. Probably OK to do. Also maybe simplified ExtensionBlock.to_native_types, but may have broken something along the way.
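
To illustrate the `read_csv` change, here is a small sketch assuming pandas 1.0 or later, where `<NA>` is among the default NA strings. The column names and data are made up for the example.

```python
import io

import pandas as pd

csv_text = "A,B\n1,x\n<NA>,y\n"

# With default settings, the literal string "<NA>" is parsed as missing.
df = pd.read_csv(io.StringIO(csv_text))
print(df["A"].isna().tolist())

# Disabling the default NA strings keeps "<NA>" as ordinary text.
raw = pd.read_csv(io.StringIO(csv_text), keep_default_na=False)
print(raw["A"].tolist())
```

Passing `keep_default_na=False` is one way to opt out for files where `<NA>` is meaningful text.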

@WillAyd
Member

WillAyd commented Jan 8, 2020

On board with this once green

@jreback jreback added Docs Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate and removed Docs labels Jan 9, 2020
@jreback
Contributor

jreback commented Jan 9, 2020

some doc-test failures, otherwise lgtm.

@@ -575,6 +575,7 @@ Other API changes
Supplying anything else than ``how`` to ``**kwargs`` raised a ``TypeError`` previously (:issue:`29388`)
- When testing pandas, the new minimum required version of pytest is 5.0.1 (:issue:`29664`)
- :meth:`Series.str.__iter__` was deprecated and will be removed in future releases (:issue:`28277`).
- Added ``<NA>`` to the list of default NA values for :meth:`read_csv` (:issue:`30821`)
Contributor

may want to also mention in the NA section

@jreback
Contributor

jreback commented Jan 9, 2020

I think you need to update the tests if a nullable integer / bool / string array appears in Series/DataFrame? (or does this not work yet)?

@TomAugspurger
Contributor Author

It works. We have smoke tests in the base class. Will add some dedicated ones.
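
A dedicated test along these lines might look like the sketch below. The helper name is hypothetical and not taken from the pandas test suite; it only checks that a missing value in a nullable integer, boolean, or string array is rendered as `<NA>` inside a Series and a DataFrame.

```python
import pandas as pd


def check_na_repr(values, dtype):
    """Hypothetical helper: assert <NA> shows up in Series/DataFrame reprs."""
    arr = pd.array(values, dtype=dtype)
    ser = pd.Series(arr, name="A")
    assert "<NA>" in repr(ser)
    assert "<NA>" in repr(ser.to_frame())


check_na_repr([1, 2, None], "Int64")
check_na_repr([True, False, None], "boolean")
check_na_repr(["a", "b", None], "string")
```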

@TomAugspurger
Contributor Author

Passing now. Merging in an hour.

@TomAugspurger TomAugspurger merged commit 493363e into pandas-dev:master Jan 9, 2020
@TomAugspurger TomAugspurger deleted the na-repr branch January 9, 2020 16:19
@jorisvandenbossche
Member

I had to include <NA> in the list of default NA values for read_csv to get this working.

Can you explain why? Pandas does not write such format to CSV files, so it shouldn't be needed for a roundtrip
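
For context on the roundtrip point, a sketch assuming default `to_csv` settings (where `na_rep` is the empty string): the missing value is written as an empty field, not as `<NA>`, and reading that field back still yields a missing value. The frame here is invented for illustration.

```python
import io

import pandas as pd

df = pd.DataFrame(
    {"A": pd.array([1, None], dtype="Int64"), "B": ["x", "y"]}
)

# to_csv returns the CSV text when no path is given.
csv_text = df.to_csv(index=False)
print(csv_text)

# The empty field is parsed as missing on the way back in.
roundtrip = pd.read_csv(io.StringIO(csv_text))
print(roundtrip["A"].isna().tolist())
```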

@@ -1230,7 +1230,7 @@ def _format(x):
 if x is None:
     return "None"
 elif x is NA:
-    return "NA"
+    return formatter(x)
Member

This means that the formatter function needs to be able to handle NAs, which now is maybe not the case?

Member

Will do a PR for this, I think this breaks geopandas
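
The concern can be sketched without pandas. The names below (`NA`, `format_value`, `two_decimals`) are hypothetical stand-ins, not pandas internals: if the missing-value branch delegates to a user-supplied formatter, a formatter written only for numbers raises on the sentinel, whereas returning a fixed representation avoids calling it.

```python
class _NAType:
    """Hypothetical missing-value sentinel; not pandas' NAType."""

    def __repr__(self) -> str:
        return "<NA>"


NA = _NAType()


def two_decimals(x) -> str:
    # A formatter that only knows how to handle numbers.
    return f"{x:.2f}"


def format_value(x, formatter, delegate_na: bool) -> str:
    if x is NA:
        # Either hand the sentinel to the formatter, or bypass it.
        return formatter(x) if delegate_na else str(NA)
    return formatter(x)


print(format_value(1.5, two_decimals, delegate_na=True))
print(format_value(NA, two_decimals, delegate_na=False))

try:
    format_value(NA, two_decimals, delegate_na=True)
except TypeError as exc:
    print(f"formatter could not handle NA: {exc}")
```

Third-party extension arrays that pass their own formatter would hit the failing path in this sketch, which is the kind of breakage the comment anticipates.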

Labels
Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate
Development

Successfully merging this pull request may close these issues.

DISCUSS: disambiguation of NA and "NA" in reprs
4 participants